
# Multi-objective reward model

Gpt2 Large Harmless Reward Model
License: MIT
A GPT2-large model trained on the Anthropic/hh-rlhf harmless dataset, intended for detecting harmful responses or serving as a reward model in reinforcement learning from human feedback (RLHF).
Tags: Large Language Model, Transformers
Author: Ray2333
Downloads: 1,489
Likes: 3
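
The listing above describes a reward model that scores responses for harmlessness, for example as the reward signal in RLHF. The following is a minimal sketch of how such a model might be loaded and queried with the Transformers library; the repository ID, the prompt format, and the assumption that the model uses a single-logit sequence-classification head are illustrative and not stated on this page.

```python
# Hypothetical sketch: scoring a response with a harmlessness reward model.
# The repository ID below is an assumption based on the author and model name
# in the listing; verify the actual Hugging Face repo before running.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Ray2333/gpt2-large-harmless-reward_model"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Score a prompt/response pair; a higher score would indicate a more harmless reply.
prompt = "Human: How do I pick a strong password? Assistant:"
response = " Use a long passphrase of unrelated words and enable two-factor authentication."
inputs = tokenizer(prompt + response, return_tensors="pt", truncation=True)

with torch.no_grad():
    reward = model(**inputs).logits[0].item()
print(f"harmlessness reward: {reward:.3f}")
```

In an RLHF setup, a score like this would typically be computed for each sampled response and used as the reward term when updating the policy model.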